FudanNLP: A Toolkit for Chinese Natural Language Processing

نویسندگان

  • Xipeng Qiu
  • Qi Zhang
  • Xuanjing Huang
چکیده

The growing need for Chinese natural language processing (NLP) is largely in a range of research and commercial applications. However, most of the currently Chinese NLP tools or components still have a wide range of issues need to be further improved and developed. FudanNLP is an open source toolkit for Chinese natural language processing (NLP), which uses statistics-based and rule-based methods to deal with Chinese NLP tasks, such as word segmentation, part-ofspeech tagging, named entity recognition, dependency parsing, time phrase recognition, anaphora resolution and so on.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization

In this paper, we give an overview for the shared task at the CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2017): Chinese News Headline Categorization. The dataset of this shared task consists 18 classes, 12,000 short texts along with corresponded labels for each class. The dataset and example code can be accessed at https://github.com/FudanNLP/ nlpcc2017_news_headli...

متن کامل

Overview of the NLPCC-ICCPOL 2016 Shared Task: Chinese Word Segmentation for Micro-Blog Texts

In this paper, we give an overview for the shared task at the 5th CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2016): Chinese word segmentation for micro-blog texts. Different with the popular used newswire datasets, the dataset of this shared task consists of the relatively informal micro-texts. Besides, we also use a new psychometric-inspired evaluation metric for ...

متن کامل

FudanNLP at RITE 2011: a Shallow Semantic Approach to Textual Entailment

RITE is a task recognizing logic relations between texts. This paper presents FDCS’s approach on NTCIR9-RITE Chinese simplified BC & MC subtasks. Our system is built on a machine learning architecture with features selected on shallow semantic methods, including named entity recognition, date & time expression extraction, words overlap and negation concept recognition. FudanNLP is wildly used b...

متن کامل

NiuParser: A Chinese Syntactic and Semantic Parsing Toolkit

We present a new toolkit NiuParser for Chinese syntactic and semantic analysis. It can handle a wide range of Natural Language Processing (NLP) tasks in Chinese, including word segmentation, partof-speech tagging, named entity recognition, chunking, constituent parsing, dependency parsing, and semantic role labeling. The NiuParser system runs fast and shows state-of-the-art performance on sever...

متن کامل

Chinese Web Scale Linguistic Datasets and Toolkit

The web provides a huge collection of web pages for researchers to study natural languages. However, processing web scale texts is not an easy task and needs many computational and linguistic resources. In this paper, we introduce two Chinese parts-of-speech tagged web-scale datasets and describe tools that make them easy to use for NLP applications. The first is a Chinese segmented and POS-tag...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013